340 research outputs found

    On the Suitability of Genetic-Based Algorithms for Data Mining

    Get PDF
    Data mining has as goal to extract knowledge from large databases. A database may be considered as a search space consisting of an enormous number of elements, and a mining algorithm as a search strategy. In general, an exhaustive search of the space is infeasible. Therefore, efficient search strategies are of vital importance. Search strategies on genetic-based algorithms have been applied successfully in a wide range of applications. We focus on the suitability of genetic-based algorithms for data mining. We discuss the design and implementation of a genetic-based algorithm for data mining and illustrate its potentials

    Computation of generalized inverses using Php/MySql environment

    Full text link
    The main aim of this paper is to develop a client/server-based model for computing the weighted Moore-Penrose inverse using the partitioning method as well as for storage of generated results. The web application is developed in the PHP/MySQL environment. The source code is open and free for testing by using a web browser. Influence of different matrix representations and storage systems on the computational time is investigated. The CPU time for searching the previously stored pseudo-inverses is compared with the CPU time spent for new computation of the same inverses.Comment: International Journal of Computer Mathematics, Volume 88, Issue 11, 201

    Performance Degradation and Cost Impact Evaluation of Privacy Preserving Mechanisms in Big Data Systems

    Get PDF
    Big Data is an emerging area and concerns managing datasets whose size is beyond commonly used software tools ability to capture, process, and perform analyses in a timely way. The Big Data software market is growing at 32% compound annual rate, almost four times more than the whole ICT market, and the quantity of data to be analyzed is expected to double every two years. Security and privacy are becoming very urgent Big Data aspects that need to be tackled. Indeed, users share more and more personal data and user-generated content through their mobile devices and computers to social networks and cloud services, losing data and content control with a serious impact on their own privacy. Privacy is one area that had a serious debate recently, and many governments require data providers and companies to protect users’ sensitive data. To mitigate these problems, many solutions have been developed to provide data privacy but, unfortunately, they introduce some computational overhead when data is processed. The goal of this paper is to quantitatively evaluate the performance and cost impact of multiple privacy protection mechanisms. A real industry case study concerning tax fraud detection has been considered. Many experiments have been performed to analyze the performance degradation and additional cost (required to provide a given service level) for running applications in a cloud system

    Formalising openCypher Graph Queries in Relational Algebra

    Get PDF
    Graph database systems are increasingly adapted for storing and processing heterogeneous network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad-hoc solutions, thus subjecting the users to the possibility of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or creating standardised languages. In this paper, we present a formal specification for openCypher, a high-level declarative graph query language with an ongoing standardisation effort. We introduce relational graph algebra, which extends relational operators by adapting graph-specific operators and define a mapping from core openCypher constructs to this algebra. We propose an algorithm that allows systematic compilation of openCypher queries.Comment: ADBIS conference (21st European Conference on Advances in Databases and Information Systems) The final publication is available at Springer via https://doi.org/10.1007/978-3-319-66917-5_1
    corecore